28 research outputs found
Fast filtering and animation of large dynamic networks
Detecting and visualizing the most relevant changes in an evolving
network is an open challenge in several domains. We present a fast algorithm
that filters subsets of the strongest nodes and edges representing an evolving
weighted graph and visualizes them either by creating a movie or by streaming
them to an interactive network visualization tool. The algorithm is an
approximation of an exponential sliding time window that scales linearly with
the number of interactions. We compare the algorithm against rectangular and
exponential sliding time-window methods. Our network filtering algorithm: i)
captures persistent trends in the structure of dynamic weighted networks, ii)
smooths transitions between the snapshots of a dynamic network, and iii) uses
limited memory and processor time. The algorithm is publicly available as
open-source software.
Comment: 6 figures, 2 tables
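As a rough illustration of the exponential sliding time window mentioned in the abstract, the sketch below keeps one exponentially decayed score per edge and reports the strongest edges. It implements the plain exponential window with lazy updates, not the authors' approximation, and the names `decayed_edge_scores`, `strongest_edges`, and `tau` are hypothetical and do not come from the paper or its software.

```python
import math


def decayed_edge_scores(interactions, tau):
    """One pass over time-ordered interactions (time, source, target, weight).

    Returns a dict mapping edge -> (score, time of last update) and the time
    of the last interaction. Each edge score decays as exp(-dt / tau) between
    its own updates, so the cost is linear in the number of interactions.
    """
    scores = {}
    t_end = None
    for t, u, v, w in interactions:
        old, t_last = scores.get((u, v), (0.0, t))
        scores[(u, v)] = (old * math.exp(-(t - t_last) / tau) + w, t)
        t_end = t
    return scores, t_end


def strongest_edges(interactions, tau, k):
    """Return the k edges with the highest decayed score at the final time."""
    scores, t_end = decayed_edge_scores(interactions, tau)
    # Bring every edge score forward to the same reference time before ranking.
    current = {
        edge: s * math.exp(-(t_end - t_last) / tau)
        for edge, (s, t_last) in scores.items()
    }
    return sorted(current, key=current.get, reverse=True)[:k]
```

With a short time scale `tau`, recent interactions dominate the ranking; with a long one, edges that recur over time stay on top, which is the kind of persistent trend the abstract refers to.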
Distinguishing Topical and Social Groups Based on Common Identity and Bond Theory
Social groups play a crucial role in social media platforms because they form
the basis for user participation and engagement. Groups are created explicitly
by members of the community, but also form organically as members interact. Due
to their importance, they have been studied widely (e.g., community detection,
evolution, and activity). One of the key questions for understanding how such
groups evolve is whether there are different types of groups and how they
differ. In sociology, theories have been proposed to help explain how such
groups form. In particular, the common identity and common bond theory states
that people join groups based on identity (i.e., interest in the topics
discussed) or bond attachment (i.e., social relationships). The theory has been
applied qualitatively to small groups to classify them as either topical or
social. We use the identity and bond theory to define a set of features to
classify groups into those two categories. Using a dataset from Flickr, we
extract user-defined groups and automatically detected groups, obtained from a
community detection algorithm. We discuss the process of manual labeling of
groups into social or topical and present results of predicting the group label
based on the defined features. We directly validate the predictions of the
theory showing that the metrics are able to forecast the group type with high
accuracy. In addition, we present a comparison between declared and detected
groups along topicality and sociality dimensions.
Comment: 10 pages, 6 figures, 2 tables
Resilience of Supervised Learning Algorithms to Discriminatory Data Perturbations
Discrimination is a focal concern in supervised learning algorithms
augmenting human decision-making. These systems are trained using historical
data, which may have been tainted by discrimination, and may learn biases
against the protected groups. An important question is how to train models
without propagating discrimination. In this study, we i) define and model
discrimination as perturbations of a data-generating process and show how
discrimination can be induced via attributes correlated with the protected
attributes; ii) introduce a measure of resilience of a supervised learning
algorithm to potentially discriminatory data perturbations; iii) propose a
novel supervised learning algorithm that inhibits discrimination; and iv) show
that it is more resilient to discriminatory perturbations in synthetic and
real-world datasets than state-of-the-art learning algorithms. The proposed
method can be used with general supervised learning algorithms and avoids
inducement of discrimination, while maximizing model accuracy.
Comment: 17 pages, 10 figures, 1 table
Estimating community feedback effect on topic choice in social media with predictive modeling
Social media users post content on various topics. A defining feature of social media is that other users can provide feedback, called community feedback, on their content in the form of comments, replies, and retweets. We hypothesize that the amount of received feedback influences the choice of topics on which a social media user posts. However, it is challenging to test this hypothesis, as user heterogeneity and external confounders complicate measuring the feedback effect. Here, we investigate this hypothesis with a predictive approach based on an interpretable model of an author's decision to continue the topic of their previous post. We explore the confounding factors, including authors' topic preferences and unobserved external factors such as news and social events, by optimizing the predictive accuracy. This approach enables us to identify which users are susceptible to community feedback. Overall, we find that 33% and 14% of active users on Reddit and Twitter, respectively, are influenced by community feedback. The model suggests that this feedback alters the probability of topic continuation by up to 14%, depending on the user and the amount of feedback.
Global news synchrony and diversity during the start of the COVID-19 pandemic
News coverage profoundly affects how countries and individuals behave in international relations. Yet, we have little empirical evidence of how news coverage varies across countries. To enable studies of global news coverage, we develop an efficient computational methodology that comprises three components: (i) a transformer model to estimate multilingual news similarity; (ii) a global event identification system that clusters news based on a similarity network of news articles; and (iii) measures of news synchrony across countries and news diversity within a country, based on country-specific distributions of news coverage of the global events. Each component achieves state-of-the-art performance, scaling seamlessly to massive datasets of millions of news articles.
We apply the methodology to 60 million news articles published globally between January 1 and June 30, 2020, across 124 countries and 10 languages, detecting 4357 news events. We identify the factors explaining diversity and synchrony of news coverage across countries. Our study reveals that news media tend to cover a more diverse set of events in countries with larger Internet penetration, more official languages, larger religious diversity, higher economic inequality, and larger populations. Coverage of news events is more synchronized not only between countries that actively participate in commercial and political relations (such as pairs of countries with high bilateral trade volume, and countries that belong to the NATO military alliance or the BRICS group of major emerging economies) but also between countries that share certain traits: an official language, high GDP, and high democracy indices.
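The abstract describes diversity and synchrony measures built on country-specific distributions of coverage over global events but does not give their formulas. The sketch below is one plausible reading under stated assumptions: diversity is taken as the Shannon entropy of a country's distribution, and synchrony as the cosine similarity between two countries' distributions. Both choices, and the function names, are assumptions for illustration and are not the authors' definitions.

```python
import math
from collections import Counter


def coverage_distribution(event_ids):
    """Share of a country's articles devoted to each event.

    event_ids holds one event identifier per article.
    """
    counts = Counter(event_ids)
    total = sum(counts.values())
    return {event: n / total for event, n in counts.items()}


def diversity(dist):
    """Shannon entropy (in nats) of a coverage distribution (assumed measure)."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)


def synchrony(p, q):
    """Cosine similarity between two coverage distributions (assumed measure)."""
    dot = sum(share * q.get(event, 0.0) for event, share in p.items())
    norm_p = math.sqrt(sum(x * x for x in p.values()))
    norm_q = math.sqrt(sum(x * x for x in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0
```

Under these assumptions, a country covering many events evenly gets a high diversity value, and two countries that devote similar shares of coverage to the same events get a synchrony value close to 1.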
Social features of online networks: the strength of intermediary ties in online social media
An increasing fraction of today's social interactions occurs using online social
media as communication channels. Recent worldwide events, such as social
movements in Spain or revolts in the Middle East, highlight their capacity to
boost people's coordination. Online networks generally display a rich internal
structure where users can choose among different types and intensities of
interactions. Despite this, there are still open questions regarding the
social value of online interactions. For example, the existence of users with
millions of online friends casts doubt on the relevance of these relations. In
this work, we focus on Twitter, one of the most popular online social networks,
and find that the network formed by the basic type of connections is organized
in groups. The activity of the users conforms to the landscape determined by
such groups. Furthermore, Twitter's distinction between different types of
interactions allows us to establish a parallelism between online and offline
social networks: personal interactions are more likely to occur on links
internal to the groups (the weakness of strong ties), whereas events
transmitting new information go preferentially through links connecting
different groups (the strength of weak ties) or, even more, through links
connecting to users who belong to several groups and act as brokers (the
strength of intermediary ties).
Comment: 14 pages, 18 figures
Distinguishing between Topical and Non-Topical Information Diffusion Mechanisms in Social Media
A number of recent studies of information diffusion in social media, both
empirical and theoretical, have been inspired by viral propagation models
derived from epidemiology. These studies model the propagation of memes, i.e.,
pieces of information, between users in a social network similarly to the way
diseases spread in human society. Importantly, one would expect a meme to
spread in a social network amongst the people who are interested in the topic
of that meme. Yet, the importance of topicality for information diffusion has
been less explored in the literature.
Here, we study empirical data about two different types of memes (hashtags
and URLs) spreading through Twitter's online social network. For every
meme, we infer its topics and for every user, we infer her topical interests.
To analyze the impact of such topics on the propagation of memes, we introduce
a novel theoretical framework of information diffusion. Our analysis identifies
two distinct mechanisms, namely topical and non-topical, of information
diffusion. The non-topical information diffusion resembles disease spreading as
in simple contagion. In contrast, the topical information diffusion happens
between users who are topically aligned with the information and has
characteristics of complex contagion. Non-topical memes spread broadly among
all users and end up being relatively popular. Topical memes spread narrowly
among users who have interests topically aligned with them and are diffused
more readily after multiple exposures. Our results show that the topicality of
memes and users' interests are essential for understanding and predicting
information diffusion.
Comment: Accepted to ICWSM'16, 10 pages, 8 figures, 2 tables, ICWSM'16:
Proceedings of the 10th International AAAI Conference on Web and Social Media
Entangling Mobility and Interactions in Social Media
Daily interactions naturally define social circles. Individuals tend to be friends with the people they spend time with and they choose to spend time with their friends, inextricably entangling physical location and social relationships. As a result, it is possible to predict not only someone's location from their friends' locations but also friendship from spatial and temporal co-occurrence. While several models have been developed to separately describe mobility and the evolution of social networks, there is a lack of studies coupling social interactions and mobility. In this work, we introduce a model that bridges this gap by explicitly considering the feedback of mobility on the formation of social ties. Data coming from three online social networks (Twitter, Gowalla and Brightkite) is used for validation. Our model reproduces various topological and physical properties of the networks not captured by models uncoupling mobility and social interactions such as: i) the total size of the connected components, ii) the distance distribution between connected users, iii) the dependence of the reciprocity on the distance, iv) the variation of the social overlap and the clustering with the distance. Besides numerical simulations, a mean-field approach is also used to study analytically the main statistical features of the networks generated by a simplified version of our model. The robustness of the results to changes in the model parameters is explored, finding that a balance between friend visits and long-range random connections is essential to reproduce the geographical features of the empirical networks.